Identification of Catechins’ Binding Sites in Monomeric Aβ42 through Ensemble Docking and MD Simulations

The assembly of the amyloid-β peptide (Aβ) into toxic oligomers and fibrils is associated with Alzheimer’s disease and dementia. Therefore, disrupting amyloid assembly by direct targeting of the Aβ monomeric form with small molecules or antibodies is a promising therapeutic strategy. However, given the dynamic nature of Aβ, standard computational tools cannot be easily applied for high-throughput structure-based virtual screening in drug discovery projects. In the current study, we propose a computational pipeline—in the framework of the ensemble docking strategy—to identify catechins’ binding sites in monomeric Aβ42. It is shown that both hydrophobic aromatic interactions and hydrogen bonding are crucial for the binding of catechins to Aβ42. Additionally, it has been found that all the studied ligands, especially EGCG, can act as potent inhibitors against amyloid aggregation by blocking the central hydrophobic region of Aβ. Our findings are evaluated and confirmed with multi-microsecond MD simulations. Finally, it is suggested that our proposed pipeline, with low computational cost in comparison with MD simulations, is a suitable approach for the virtual screening of ligand libraries against Aβ.


Introduction
Intrinsically disordered proteins (IDPs) are very flexible biomolecules without a welldefined folded structure and typically have important roles in biological processes, in particular in cellular signaling and gene regulation [1][2][3][4]. Under certain conditions, some IDPs may aggregate into highly toxic oligomers. These oligomers are associated with a wide range of serious human diseases such as cancer, neurodegenerative diseases, autoimmune disorders, cardiovascular disease, and type II diabetes [4][5][6][7][8][9]. Thus, preventing or reducing aggregation of the IDPs involved in such diseases is as an effective therapeutic strategy.
Finding aggregation inhibitors and direct targeting of monomeric IDPs via small molecules is a very active area of research and a wide variety of computational techniques have been applied, yet there are many inherent difficulties [12][13][14]27,[39][40][41][42][43][44][45][46]. For example, most docking algorithms employ the flexible ligand and rigid receptor paradigm [47][48][49]. However, IDPs display high conformational heterogeneity, and ligand binding causes large structural changes in the IDP conformations. To circumvent this problem, heterogeneous conformational ensembles of IDPs have been used for docking studies [27,38,42,50,51]. Nevertheless, challenges associated with generating and choosing a set of suitable conformations for docking still remain. It will also be interesting to see how and if Alphafold [52] will change the situation as it has already been applied to non-IDP-related binding problems [53], but IDPs appear to be a challenge even for Alphafold [54].
Several methods have been proposed for efficient sampling of IDP's conformational space and constructing a representative conformational ensemble. Examples include replica-exchange- [55][56][57][58] and metadynamics-based methods [59][60][61], diffusion map approaches [62,63] and Markov state modeling [64][65][66][67]. For comparisons between the methods, see for example [43,[68][69][70]. One alternative for effective exploration of the conformational space is the use of multiple conventional MD trajectories (replicas) with different initial conditions (different velocities or/and different starting configurations). The strategy of choosing the initial conditions controls the effectiveness of this approach and its ability to enhance conformational sampling performance [71][72][73][74][75][76]. We have recently proposed a new efficient algorithm for comprehensive exploring of the conformational space of IDPs, called Blockwise Excursion Sampling (BES) [75,76]. It uses simulated annealing (SA) to find different low-energy states of various regions of conformational space as optimal starting configurations for short conventional MD simulations. In BES, conformational sampling is based on many uncorrelated short MD simulations starting from different points of the accessible phase space. It has been shown that the protocol is successful in generating a diverse conformational library for IDP conformations in agreement with experimental data [75,76].

Docking Analysis
Structural Analysis and Identifying Ligand Binding Site Determination of the important residues that are in close contact with the ligand is very important for the identification of potential inhibitors of Aβ 42 aggregation. The distances between the heavy atoms of the ligand and the residues of Aβ 42 were used to define the binding sites of Aβ 42 . Based on this assumption, a list of binding residues for each selected complex was generated for which at least one heavy atom of the residues falls within the distance cutoff of any ligand heavy atom. To assess the effect of the distance cutoff, five distances were evaluated: 3.0, 3.5, 4.0, 4.5, and 5.0 Å; the distances cutoff from 3.0 to 5.0 Å are commonly used to study ligand-protein binding interactions, such as hydrogen bonds, hydrophobic contacts, and aromatic interactions [22,[95][96][97][98][99][100].
The number of contacts between each ligand and Aβ 42 residues for the set-1 (as defined in Section 3.2) was counted for five different distance cutoffs and tabulated in Tables S1-S5. Here, we only show the results related to the distance cutoff of 5.0 Å in Table 1. It is very important to emphasize that in our structural analysis, none of the conformers selected from molecular docking alone were used to draw qualitative conclusions about the binding site. All the structural analyses in this study are based on the results given in Table 1, which were obtained from the statistical study of all the selected complexes based on ∆∆G binding , which is illustrated in more detail in Section 3.2.
The first observation from the tables is that all the ligands have the most contacts with residues Tyr10, Phe19, and Phe20. Moreover, based on the ranked lists (Tables 1 and S1-S5), other aromatic residues (His13, His14, His6, and Phe4) show a relatively high number of contacts and thus, they contribute to stabilizing the interactions with catechins. Therefore, it seems that these polyphenolic compounds tend to interact with the aromatic residues through stacking and/or T-shaped interactions, as shown in the snapshots of  Another important observation is that the tendency of ligands to interact with Tyr10 correlates with the number of hydroxyl groups on the ligands. This is seen in the data in the tables: EGCG possesses the largest number of hydroxyl groups (8 OH's) and has a greater tendency to interact with Tyr10 than with Phe19 or Phe20, i.e., the number of contacts between EGCG and Tyr10 is larger than those between EGCG and Phe19 or Phe20. Thus, the data appear to imply that, hydrophobic aromatic interactions and hydrogen bonding are both crucial for the binding process. Finally, a comparative look at the tabulated values immediately shows that all ligands have a tendency to associate with the hydrophobic region of Aβ 42 spanning residues from Tyr10 to Phe20. This region contains most of the aromatic residues found in full-length Aβ. This region also encompasses the central hydrophobic region ( 16 KLVFF 20 ) that, based on many experimental and computational studies, is involved in the initiation of amyloid aggregation [11][12][13]22,40,[101][102][103][104][105][106]. Therefore, our docking results show that all studied ligands, especially EGCG, can act as potent inhibitors against amyloid aggregation through blocking the central hydrophobic region. These findings are in agreement with experimental studies [28,37,[78][79][80]83]. been highlighted by colored dashed lines for clarity. In all cases, at least one aromatic residue was in contact with the ligand and in most cases, catechin compounds possessing multiple aromatic rings were capable of interacting with several aromatic residues simultaneously. A comparative look at Figures 1-5 shows that one or two rotatable aromatic rings of the ligands (phenyl ring containing R3 substitution and gallate moiety, as illustrated in Section 3.1) are essential to make the aromatic interactions between the ligands and the aromatic residues. It is reasonable to assume that these flexible rings can be adjusted during the docking process, to maximize the aromatic interaction formation. In conclusion, it seems that these aromatic interactions play an important role in binding to aggregation-prone regions of Aβ42 and are essential for high affinity and binding specificity.  The three aromatic residue hotspots (i.e., Tyr10, Phe19, and Phe20, see Figure 7 for sequence) and interacting aromatic residues with the ligand for some complexes with the highest binding affinities are shown in licorice representation. Favorable aromatic interactions between the ligand and peptide are depicted with orange dashed lines.

Analysis of MD Simulations
In order to evaluate the docking results, microsecond-scale MD simulations were performed on Aβ42-L (L = C, EC, ECG, EGC, EGCG) structures for which the largest binding energies were obtained by the docking procedure. The initial and final MD structures are provided in Figure S1 which show that the ligands maintain their interactions with Aβ42 throughout the simulation. Root mean square deviation (RMSD) of Aβ42 indicates that in all cases the system has reached steady state (Table S6 and Figure S2). The stability of the Aβ42-L complexes is also reflected in the relatively low radius of gyration (~1.0 nm) of Aβ42 (Table S6 and Figure S3). Moreover, the steady solvent accessible surface area (SASA; with < 7% fluctuations) observed for the chains provides further support for the stability and  The docking results suggest that Tyr10, Phe19, and to some extent Phe20, are the main residues involved in ligand binding. Average distances between these residues and the five ligands from the MD simulations are provided in Table 2. In agreement with the results from our docking procedure, with the exception of C for which the distance is ~1.0 nm, the average distance between the ligands and all three residues is ~0.5 nm (Table 2 and Figures S5-S9). The time evolution of these distances and the end-to-end distance for Aβ42 are shown in Figures S5-S9. Table 2. Average Aβ42 end-to-end distance (D1-A42) and distances between EGCG and Y10, F19, F20 residues (the amino acid sequence is provided in the caption of  Inspection of hydrogen bonds between the ligands and Aβ42 reveals that hydrogen bonding plays a more important role in binding of EGCG and EGC than with the other three ligands ( Table 3). Binding of C seems to have the least dependence on the hydrogen bonding among the ligands and relies on the π-π, XH-π (X = C, O) interactions. Average number of Aβ42···ligand, Aβ42···solvent, and ligand···solvent hydrogen bonds are collected in Table 3.  The three aromatic residue hotspots (i.e., Tyr10, Phe19, and Phe20, see Figure 7 for sequence) and interacting aromatic residues with the ligand for some complexes with the highest binding affinities are shown in licorice representation. Favorable aromatic interactions between the ligand and peptide are depicted with orange dashed lines. Figures S10-S14 depict the contributions of Aβ42 residues that involve hydrogen bonding with the ligands. A comparison among the five Aβ42···ligand systems reveals that there are a few residues in the Aβ42 sequence that frequently form hydrogen bonds with the five different catechins (e.g., Glu22, Asp23, and Ala42). These intransient hydrogen bonds alternate between the ligand and the residues, which may suggest that hydrogen bonds have a less important role in the binding of the ligands to Aβ42 compared to stacking interactions. The average number of hydrogen bonds between all the amino acids of the peptide and ligands are depicted in Figure S15. Similar conclusion has been previously made for Aβ42 protofibrils and EGCG, where EGCG was shown to bind to Aβ42 monomer through hydrophobic, π-π stacking, and hydrogen bonds [79,[83][84][85]87,107]. Moreover, same study identifies Asp1, Glu22, and Ala42 residues to form the most hydrogen bonds with EGCG, which except for Asp1, agrees with our observation ( Figure S14). It should be noted, however, that while we have examined disordered structures in this study, Li et al. [107] considered fibrils and 25 EGCG molecules.

Number of Hydrogen Bonds
Finally, the secondary structure of Aβ42 in the presence of catechins was also evaluated and plotted using the VMD visualization software ( Figure 6). As is evident from the figure, turn' and coil' are the main observed secondary structures, followed by the helices and β-strands. In some instances, relatively short-lived fast-alternating helices (α-helix and 310 helix) as well as β-strands are formed. The propensity of certain regions of Aβ42 toward a specific type of secondary structure varies with the ligand type. For example, the N-terminal domain is mainly unstructured (turn and coil) in the presence of ECG, while a The three aromatic residue hotspots (i.e., Tyr10, Phe19, and Phe20, see Figure 7 for sequence) and interacting aromatic residues with the ligand for some complexes with the highest binding affinities are shown in licorice representation. Favorable aromatic interactions between the ligand and peptide are depicted with orange dashed lines.
The distributions of binding affinities (kcal.mol −1 ) of all ligands in set-1 (with the cutoff value of 0.1 kcal.mol −1 ) are shown in Figures 1-5. In addition, the figures show a couple of complexes with the highest binding affinities and for each complex the different types of aromatic interactions, such as π-π, XH-π (X = C, N, O), and lone pair-π interactions between the aromatic rings of the ligands, and the sidechain of aromatic residues have been highlighted by colored dashed lines for clarity. In all cases, at least one aromatic residue was in contact with the ligand and in most cases, catechin compounds possessing multiple aromatic rings were capable of interacting with several aromatic residues simultaneously. A comparative look at Figures 1-5 shows that one or two rotatable aromatic rings of the ligands (phenyl ring containing R3 substitution and gallate moiety, as illustrated in Section 3.1) are essential to make the aromatic interactions between the ligands and the aromatic residues. It is reasonable to assume that these flexible rings can be adjusted during the docking process, to maximize the aromatic interaction formation. In conclusion, it seems that these aromatic interactions play an important role in binding to aggregation-prone regions of Aβ 42 and are essential for high affinity and binding specificity.

Analysis of MD Simulations
In order to evaluate the docking results, microsecond-scale MD simulations were performed on Aβ 42 -L (L = C, EC, ECG, EGC, EGCG) structures for which the largest binding energies were obtained by the docking procedure. The initial and final MD structures are provided in Figure S1 which show that the ligands maintain their interactions with Aβ 42 throughout the simulation. Root mean square deviation (RMSD) of Aβ 42 indicates that in all cases the system has reached steady state (Table S6 and Figure S2). The stability of the Aβ 42 -L complexes is also reflected in the relatively low radius of gyration (~1.0 nm) of Aβ 42 (Table S6 and Figure S3). Moreover, the steady solvent accessible surface area (SASA; with <7% fluctuations) observed for the chains provides further support for the stability and compactness of the system (Table S6 and Figure S4).
The docking results suggest that Tyr10, Phe19, and to some extent Phe20, are the main residues involved in ligand binding. Average distances between these residues and the five ligands from the MD simulations are provided in Table 2. In agreement with the results from our docking procedure, with the exception of C for which the distance is~1.0 nm, the average distance between the ligands and all three residues is~0.5 nm (Table 2 and Figures S5-S9). The time evolution of these distances and the end-to-end distance for Aβ 42 are shown in Figures S5-S9. Table 2. Average Aβ 42 end-to-end distance (D1-A42) and distances between EGCG and Y10, F19, F20 residues (the amino acid sequence is provided in the caption of Figure 7, in Section 3.1). Standard deviations are provided in parentheses. Inspection of hydrogen bonds between the ligands and Aβ 42 reveals that hydrogen bonding plays a more important role in binding of EGCG and EGC than with the other three ligands ( Table 3). Binding of C seems to have the least dependence on the hydrogen bonding among the ligands and relies on the π-π, XH-π (X = C, O) interactions. Average number of Aβ 42 ···ligand, Aβ 42 ···solvent, and ligand···solvent hydrogen bonds are collected in Table 3. Table 3. Average number of hydrogen bonds (donor···acceptor ≤ 0.35 nm and α (∠(hydrogen-donoracceptor) ≤ 30 • ). Standard deviations are provided in parentheses. Figures S10-S14 depict the contributions of Aβ 42 residues that involve hydrogen bonding with the ligands. A comparison among the five Aβ 42 ···ligand systems reveals that there are a few residues in the Aβ 42 sequence that frequently form hydrogen bonds with the five different catechins (e.g., Glu22, Asp23, and Ala42). These intransient hydrogen bonds alternate between the ligand and the residues, which may suggest that hydrogen bonds have a less important role in the binding of the ligands to Aβ 42 compared to stacking interactions. The average number of hydrogen bonds between all the amino acids of the peptide and ligands are depicted in Figure S15. Similar conclusion has been previously made for Aβ 42 protofibrils and EGCG, where EGCG was shown to bind to Aβ 42 monomer through hydrophobic, π-π stacking, and hydrogen bonds [79,[83][84][85]87,107]. Moreover, same study identifies Asp1, Glu22, and Ala42 residues to form the most hydrogen bonds with EGCG, which except for Asp1, agrees with our observation ( Figure S14). It should be noted, however, that while we have examined disordered structures in this study, Li et al. [107] considered fibrils and 25 EGCG molecules.

Number of Hydrogen Bonds (SD)
Finally, the secondary structure of Aβ 42 in the presence of catechins was also evaluated and plotted using the VMD visualization software ( Figure 6). As is evident from the figure, 'turn' and 'coil' are the main observed secondary structures, followed by the helices and β-strands. In some instances, relatively short-lived fast-alternating helices (α-helix and 3 10 helix) as well as β-strands are formed. The propensity of certain regions of Aβ 42 toward a specific type of secondary structure varies with the ligand type. For example, the N-terminal domain is mainly unstructured (turn and coil) in the presence of ECG, while a persistent βstrand appears in the region when EGC is bound. Similarly, residues 14-18 form a β-strand or turn-coil in the presence of EGCG but mainly form short-lived α-helices. As is apparent from the figure, different catechins have different impacts on the secondary structure of Aβ 42 , nevertheless, they all seem to impose local secondary structures across the Aβ 42 chain, potentially impacting the formation and stability of amyloid fibrils. persistent β-strand appears in the region when EGC is bound. Similarly, residues 14-18 form a β-strand or turn-coil in the presence of EGCG but mainly form short-lived α-helices.
As is apparent from the figure, different catechins have different impacts on the secondary structure of Aβ42, nevertheless, they all seem to impose local secondary structures across the Aβ42 chain, potentially impacting the formation and stability of amyloid fibrils.  Turn, β-strand, α-helix, 3 10 helix, random coil, and isolated bridge are shown in teal, yellow, blue, pink, white, and beige, respectively.

Protein and Ligand Library Preparation
A library of Aβ 42 (see Figure 7 for a snapshot and the amino acid sequence) structures was generated using the BES protocol. Briefly, 2000 excursion chains were performed such that all excursion chains were started from a fully extended structure; excursion chain refers to a sequence of MD and SA blocks in the BES protocol, for details and terminology please see [75,76]. It is interesting to note that it has already been shown that the chemical shift calculated for the conformational ensemble has a good agreement with experimental data [75,76].

Protein and Ligand Library Preparation
A library of Aβ42 (see Figure 7 for a snapshot and the amino acid sequen was generated using the BES protocol. Briefly, 2000 excursion chains were pe that all excursion chains were started from a fully extended structure; exc refers to a sequence of MD and SA blocks in the BES protocol, for details and please see [75,76]. It is interesting to note that it has already been shown that shift calculated for the conformational ensemble has a good agreement with data [75,76]. Figure 7. A random structure of Amyloid β42. Aβ42 sequence: 1 DAEFRHDSG 10 Y 20 FAEDVGSNKG 30 AIIGLMVGGV 40 VI 42 A. Asp1 (N-terminus) and Ala42 (C-termin using van der Waals radii. Other residues are represented in licorice. Acidic, basic, p polar amino acids are shown in red, blue, green, and white, respectively. Each excursion chain included five successive SA and MD blocks wi temperatures of 700, 600, 500, 400, and 350 K for the SA blocks. The relaxa each MD block was set to 120 ps and the last 100 ps were used to generate r structures. In the next step, an average (mean) structure over the MD trajec tained, and the root mean deviation (RMSD) was used as a criterion to ide figuration in the MD trajectory that is structurally closest to the average s selected structure was then energy minimized using the conjugate gradient used as a representative structure. As a result, for each MD block (five bloc cursion chain) one representative structure was derived. The final structur cluded a total of 10,000 representative structures. Scheme 1 summarizes th For more technical details and a complete description of the BES protocol se For this study, we selected five main catechins found in green tea: (1) (+) (2) (-)-epicatechin (EC), (3) (-)-epigallocatechin (EGC), (4) (-)-epicatechin-3-g and (5) (-)-epigallocatechin-3-gallate (EGCG). Their chemical structures are s Each excursion chain included five successive SA and MD blocks with maximum temperatures of 700, 600, 500, 400, and 350 K for the SA blocks. The relaxation time for each MD block was set to 120 ps and the last 100 ps were used to generate representative structures. In the next step, an average (mean) structure over the MD trajectory was obtained, and the root mean deviation (RMSD) was used as a criterion to identify the configuration in the MD trajectory that is structurally closest to the average structure. The selected structure was then energy minimized using the conjugate gradient method and used as a representative structure. As a result, for each MD block (five blocks in each excursion chain) one representative structure was derived. The final structural library included a total of 10,000 representative structures. Scheme 1 summarizes the procedure. For more technical details and a complete description of the BES protocol see [75,76].

Docking Setup
Docking simulations were performed with AutoDock Vina (version 1.1.2) software [111]. The docking search space for exploring ligand binding conformations around each representative Aβ42 structure was defined using a rectangular box centered at the cente of mass of Aβ42 with a minimal distance of 12 Å from Aβ42 to the edges of the box. Therefore depending on the size and shape of Aβ42 configuration, an optimized docking box was determined individually for each Aβ42. Each docking run generates nine optimal Aβ42-lig and bound conformations and overall, a total of 90,000 (10,000 × 9) poses were generated for each catechin compound. The different poses in each run are rank-ordered by the Vina score, a quantity that correlates with the binding free energy. The top-scoring pose in each run achieves the lowest free energy of binding in the complex.
Very recently, it has been shown that the correct pose (the pose with the lowest RMSD from the corresponding experimental pose) is usually predicted by Vina but sometimes does not get the top score in the Vina ranking [112,113]. To avoid the problem and to capture the correct poses, it is recommended that except for the top-ranked pose, some important lower-ranked poses for each docking run should be identified and selected fo post-docking analysis. For more discussions about ranking, see also [114][115][116][117]. For this purpose, the differences in the binding free energies between the top-ranked pose and lower-ranked poses were calculated for selected high-ranked modelled complexes fo each docking run, ΔΔGbinding  Figure 8 and their initial structures were taken from the ZINC database [108]. This was followed by optimization of the molecular geometries for all of them using the B3LYP exchange and correlation functional [109] and the 6-31+G(d,p) basis set. The GAMESS package [110] was used for geometry optimization.

Docking Setup
Docking simulations were performed with AutoDock Vina (version 1.1.2) software [111]. The docking search space for exploring ligand binding conformations around each representative Aβ 42 structure was defined using a rectangular box centered at the center of mass of Aβ 42 with a minimal distance of 12 Å from Aβ 42 to the edges of the box. Therefore, depending on the size and shape of Aβ 42 configuration, an optimized docking box was determined individually for each Aβ 42 . Each docking run generates nine optimal Aβ 42 -ligand bound conformations and overall, a total of 90,000 (10,000 × 9) poses were generated for each catechin compound. The different poses in each run are rank-ordered by the Vina score, a quantity that correlates with the binding free energy. The top-scoring pose in each run achieves the lowest free energy of binding in the complex.
Very recently, it has been shown that the correct pose (the pose with the lowest RMSD from the corresponding experimental pose) is usually predicted by Vina but sometimes, does not get the top score in the Vina ranking [112,113]. To avoid the problem and to capture the correct poses, it is recommended that except for the top-ranked pose, some important lower-ranked poses for each docking run should be identified and selected for post-docking analysis. For more discussions about ranking, see also [114][115][116][117]. For this purpose, the differences in the binding free energies between the top-ranked pose and lower-ranked poses were calculated for selected high-ranked modelled complexes for each docking run, ∆∆G binding (= ∆G top pose -∆G lower-ranked pose ). Different cutoff values for the ∆∆G binding threshold (0.1, 0.2 and 0.3 kcal.mol −1 ) were used for selecting docking complexes, since the optimal selection of complexes is not known. For larger ∆∆G binding cutoff values, more complexes were selected. For example, with the cutoff value of 0.1 kcal.mol −1 , 17,431 (10,000 × 1 (top-ranked poses from each run) + 7431 (lower-ranked poses near top poses with the cutoff)) complexes were selected for docking of EGCG, while 29,923 and 42,833 complexes were selected with the cutoff values of 0.2 and 0.3 kcal.mol −1 , respectively. The number of selected docking complexes for all ligands and the corresponding cutoff values of ∆∆G binding are provided and labelled as "set-n" in Table 4. Since the results for different sets are very similar, we only present the results of set-1 (with the cutoff value of 0.1) and the results for the other two sets can be found in the Supporting Information. It should be emphasized that we employed the Vina ranking score just for selecting highranked docking poses for the structural analyses. The binding energies associated with these poses do not provide further insights. the results for the other two sets can be found in the Supporting Information. It should be emphasized that we employed the Vina ranking score just for selecting high-ranked docking poses for the structural analyses. The binding energies associated with these poses do not provide further insights.

MD Simulation Setup
All-atom MD simulations were performed on the peptide with five different ligands. The force field parameters for each ligand were created using the Antechamber program in the Ambertools19 package [118] and described by the General Amber Force Field (GAFF) [119] using AM1-BCC charges [120]. The Amberff99SB*-ILDNP force field [121] and the TIP3P water model [122] were adopted for the protein and water, respectively. Each protein was placed in a dodecahedral box such that the distance from the edges of the box to every atom in the protein was at least 1 nm and 150 mM of KCl was added to reproduce physiological conditions. Overall charge neutrality was preserved by adding 3 K + counterions. The GROMACS 2016.3 [123]. package was used for all simulations. Each system was energy minimized using the method of steepest descents. This was followed by a pre-equilibration in the canonical ensemble, i.e., at constant particle number, volume, and temperature, for 100 ps. The Lennard-Jones potential was truncated using a shift function between 1.0 and 1.2 nm. Electrostatic interactions were calculated using the particle-mesh Ewald method (PME) [124,125] with a real space cutoff of 1.2 nm. The temperature was set to 310 K with the V-rescale algorithm [126] and pressure was kept at 1 atm using the Parrinello-Rahman barostat [127]. Bonds involving hydrogens were constrained using the linear constraint solver (P-LINCS) algorithm [128]. Pre-equilibration was followed by a production run of 3 µs with a time step of 2 fs for each of the five peptide-ligand systems.

Conclusions
In this work, binding of various well-known catechins present in green tea to the amyloid-β peptide (Aβ) has been predicted and analyzed. For this purpose, a computational pipeline in the framework of the ensemble docking strategy has been proposed in which a structurally heterogeneous ensemble of conformations of Aβ 42 is used. The ensemble is generated by the Blockwise Excursion Sampling (BES) protocol [75,76] in which the conformational sampling is performed on the basis of many uncorrelated short-time MD simulations starting from different reasonable points of the accessible phase space.
It was observed that all green tea catechins compounds tended to interact with the aromatic residues through stacking and/or T-shaped interactions and, because of this, all compounds show a high tendency to interact with the hydrophobic region of Aβ 42 spanning residues from Tyr10 to Phe20, the region with the highest number of the aromatic residues in full-length Aβ 42 . This region also encompasses the central hydrophobic core (CHC, residues [16][17][18][19][20] that, based on many experimental and computational studies, plays a key role in the aggregation process of Aβ 42 . Therefore, the docking results indicate that all studied ligands, especially EGCG, can act as potent inhibitors against amyloid aggregation by blocking the central hydrophobic core. Additionally, it has been suggested that both hydrophobic aromatic interactions and hydrogen bonding are crucial for the binding of catechins to Aβ 42 . To evaluate the obtained findings in binding of catechin compounds to Aβ 42 , long multi-microsecond MD simulations were performed. It was shown that the present docking protocol is highly successful in identifying catechins' binding sites in monomeric Aβ 42 , in agreement with previous MD simulations and some recent experimental observations for similar Aβ 42 -catechin complexes [28,[77][78][79][83][84][85]87,93]. Finally, we suggested that our proposed pipeline with low computational cost in comparison with MD simulations is a suitable approach for high-throughput structure-based virtual screening of ligand libraries against the intrinsically disordered proteins (IDPs), such as Aβ. The execution of our proposed docking protocol for each ligand took up to a week using a standard Intel Core i7 desktop computer, while MD simulations for each ligand required, on average, approximately six weeks on a single compute node of the Compute Canada clusters, containing 24 CPU cores and 4 NVIDIA Tesla P100 GPUs.

Conflicts of Interest:
The authors declare no conflict of interest.